In
the mid- to late 1990s, storage was not a real issue because most
organizations didn’t need to store large amounts of data or archives.
This is not the case today: the demand for storage, archiving, and high
availability of storage has grown exponentially. Network
attached storage (NAS), storage area networks (SANs), and technologies
such as Fibre Channel used to be available only in
enterprise-class storage devices; now you can get this functionality at
the server level. Windows Server 2008 includes a massive number of
improvements to its storage features, making storage decisions easier
for the administrator and resulting in a more stable and more available
infrastructure for users.
RAID Types
RAID
(Redundant Array of Inexpensive Disks) provides higher levels of
reliability, performance, and storage capacity, all at a lower cost. A
RAID set comprises multiple disk drives (an array). These
fault-tolerant arrays are classified into six different RAID levels,
numbered 0 through 5. Each level uses a different set of storage
algorithms to implement fault tolerance.
There
are two types of RAID: hardware RAID and software RAID. Hardware RAID
always has a disk controller dedicated to the RAID, to which you
cable up the disk drives. Software RAID is more flexible and less
expensive, but requires more CPU cycles and power to run. It also
operates on a partition-by-partition basis, as opposed to
hardware RAID systems, which group together entire disk drives.
RAID
0 is an array of disks implemented with disk striping. With two disks
in the array, it offers roughly twice the speed of one disk, but it
provides no drive redundancy or fault tolerance. Its only advantage is
speed.
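The round-robin distribution that striping performs can be sketched as follows. This is a minimal illustrative sketch, not an actual driver; the function name `stripe` and the block labels are hypothetical.

```python
# Illustrative sketch: RAID 0 distributes consecutive logical blocks
# across the disks round-robin, so reads and writes can proceed on
# all disks in parallel.

def stripe(blocks, num_disks):
    """Assign each logical block to a disk in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks

data = ["B0", "B1", "B2", "B3", "B4", "B5"]
print(stripe(data, 2))  # [['B0', 'B2', 'B4'], ['B1', 'B3', 'B5']]
```

Because consecutive blocks land on different disks, a large sequential transfer keeps every spindle busy at once, which is where the speed advantage comes from.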
RAID
1 is an array of disks implemented in a mirror; this means that one
disk will be a copy of the other disk. Each time any data gets written
to the disk, the system must write the same information to both disks.
To increase system performance in RAID 1, you can implement a
duplexed RAID 1 system, in which each disk in the mirror has its own
host adapter.
RAID
2 is an array of disks implemented with disk striping plus dedicated
error correction code (ECC) disks. Each time data is written to the
array, ECC codes are calculated and written alongside the data on the
ECC disks, so that any errors introduced after the data was written can
be detected.
RAID
3 is an array of disks implemented with disk striping and a dedicated
disk for parity information. Because RAID 3 uses bit striping, its
write and read performance is rather slow compared to RAID 4.
RAID
4 is an array of disks implemented with disk striping and a dedicated
disk for parity information. It is similar to RAID 3, but it performs
block striping or sector striping instead of bit striping. Thus, with
RAID 4 one entire sector is written to one drive and the next sector is
written to the next drive.
RAID
5 is an array of disks implemented with disk striping, with the parity
information also striped across all disks. RAID 5 handles small amounts
of information efficiently. This is the preferred option when setting
up fault tolerance.
RAID
6 is the same as RAID 5, with the added feature of calculating two sets
of parity information and striping it across all drives. This allows
for the failure of two disks but decreases performance slightly.
Nested
RAID 01 and 10 combine the best features of both RAID technologies.
RAID 01 is a mirror of two striped sets and RAID 10 is a stripe of
mirrored sets.
RAID 3,
RAID 4, and RAID 5 disk array designs allow for data recovery. When one
of the drives in the array fails, the parity information is used to
rebuild the faulty drive's contents.
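The parity-based recovery described above can be sketched in a few lines. This is an illustrative sketch, not storage firmware; the function name `parity` and the sample blocks are hypothetical, but the XOR relationship is the one RAID 3/4/5 actually relies on.

```python
# Illustrative sketch: the parity block in RAID 3/4/5 is the XOR of
# the data blocks in a stripe. XOR-ing the surviving blocks with the
# parity block reproduces the block on a failed drive.

def parity(blocks):
    """XOR together equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

stripe = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on three drives
p = parity(stripe)                     # stored on the parity drive

# Drive 1 fails: rebuild its block from the survivors plus parity.
rebuilt = parity([stripe[0], stripe[2], p])
print(rebuilt == stripe[1])  # True
```

RAID 6's double parity extends the same idea with a second, independently computed checksum, which is why it survives two simultaneous drive failures.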
Network Attached Storage
NAS
is a technology that delivers storage over the network. A NAS device
attaches directly to the organization's network and addresses
shortcomings previously experienced in a normal LAN, namely:
The rise of storage capacity needs
The rise of protection and security to the data stored
Management complexity for the system administrator
The
NAS can be seen as a solution to these challenges. With the added
benefit of being attached directly to the organization's IP network, it
becomes accessible to all computers that are part of the network. NAS
devices or servers are designed for simplicity of deployment: they plug
into the network without interfering with other services. NAS
devices are mostly maintenance free, and managing them takes minimal
effort due to their scaled-down software. The scaled-down
operating system and other software on the NAS unit offer data storage,
access functionality, and management. Configuring the NAS unit is
mostly done through a Web interface, as opposed to being connected to
it directly. A typical NAS unit contains one or more hard disks,
normally configured into a logical RAID layout. NAS provides storage
and a file system through a file-based protocol such as Network File
System (NFS) or Server Message Block (SMB).
The benefits that come with a NAS are as follows:
Extra networked storage capacity
Its own CPU, motherboard, RAM, etc.
Built-in RAID
Expandability
Potential drawbacks of NAS include:
File-level access only, with no block-level access for applications such as databases
Performance limited by LAN bandwidth and congestion
A potential single point of failure if the device itself is not made redundant
You
can include a NAS as part of a more comprehensive solution such as a
SAN. On a NAS, file serving is much faster than on a file server
hosting other workloads, because the file I/O is not competing for
server resources. You also can use NAS as centralized storage for
managing backups or other operating system data.
Storage Area Networks
A
SAN is a storage architecture connected to the organization's LAN. This
architecture can consist of disk arrays of various vendors and sizes,
or even tape libraries. Connecting disk arrays to the organization's
LAN over a high-speed medium (Fibre Channel or Gigabit Ethernet),
typically through Fibre Channel switches to the servers, benefits the
organization by increasing storage capacity, as multiple servers can
share the storage for common file sharing, e-mail servers, or database
servers. Large enterprises have benefited from SAN technologies for
many years; in a SAN, storage is separated from the servers and
attached directly to the network instead of being physically connected
to each server. SANs are highly scalable and provide flexible storage
allocation, better storage deployment, and more efficient backup
solutions that can span a WAN.
Traditionally,
SANs have been difficult to deploy and maintain. Ensuring high data
availability and optimal resource usage from the storage arrays
connected to the switches, as well as monitoring the physical network,
has become a full-time job and requires different skills than managing
storage on a server. Small to medium-size businesses have started to
need SAN technology, but with the different skill set required it has
proven difficult for them to implement and manage, and Fibre Channel,
the established storage technology of the past decade, made this almost
impossible for smaller businesses. With this in mind, other well-known
IP technologies have become a viable option through iSCSI. Simplifying
a SAN does not mean removing the current SAN infrastructure; it means
hiding the complexity of managing such a storage solution by
implementing a technology such as iSCSI.
Figure 1 shows the differences between Direct Attached Storage (DAS), NAS, and SAN.
SAN benefits include the following:
Simplified administration
Storage flexibility
Servers that boot from SAN
Efficient data recovery
Storage replication
iSCSI protocols developed to allow SAN extension over IP networks, resulting in less costly remote backups
The core SAN Fibre Channel infrastructure uses a switched fabric.
The fabric is designed to handle storage communications, provides more
reliable data storage than a NAS, and allows many-to-many communication
in the SAN infrastructure. A typical fabric is made up of a number of
Fibre Channel switches.
Fibre Channel
The
Fibre Channel Protocol (FCP) is the interface protocol used to
transport SCSI commands over Fibre Channel. Fibre Channel supports the
following three topologies, which designate how ports are connected. In
Fibre Channel terms, a port is a device connected to the network.
Point-to-Point (FC-P2P) Two networked devices connected back to back.
Arbitrated loop (FC-AL)
All networked devices connected in a loop. This is similar to the token
ring network topology, and carries the same advantages and
disadvantages.
Switched fabric (FC-SW) All networked devices connected to Fibre Channel switches.
The
line rates for Fibre Channel range from 1 Gbit per second
up to 10 Gbit per second, depending on the topology and hardware
implemented. The Fibre Channel layers start with a Physical layer
(FC0), which includes the cables, fiber optics, and connectors. The
Data Link layer (FC1) implements encoding and decoding. The Network
layer (FC2) is the core of Fibre Channel and defines the protocols.
The Common services layer (FC3) can include functions such as RAID and
encryption. The Protocol Mapping layer (FC4) is used for protocol
encapsulation. The following ports are defined in Fibre Channel:
N_port The node port
NL_port The node loop port
F_port The fabric port
FL_port The fabric loop port
E_port An expansion port used to link two Fibre Channel switches
EX_port A connection between the router and switch
TE_port Provides trunking expansion, with E_ports trunked together
G_port A generic port
L_port The loop port
U_port A universal port
The
Fibre Channel Host Bus Adapter (HBA) has a unique World Wide Name
(WWN); this is comparable to a network card’s Media Access Control
(MAC) address. The HBA installs into a server, like any other network
card or SCSI host adapter.
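The analogy between a WWN and a MAC address extends to notation: both are conventionally written as colon-separated hex octets. The sketch below is illustrative only; the function name `format_wwn` and the example WWN value are hypothetical.

```python
# Illustrative sketch: a World Wide Name is a 64-bit identifier,
# conventionally written as eight colon-separated hex octets, much
# like a 48-bit MAC address is written as six.

def format_wwn(wwn: int) -> str:
    raw = wwn.to_bytes(8, "big")          # 64 bits -> 8 octets
    return ":".join(f"{b:02x}" for b in raw)

print(format_wwn(0x5006016012345678))  # 50:06:01:60:12:34:56:78
```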
iSCSI
Internet
Small Computer System Interface (iSCSI) is a very popular SAN protocol
that presents network-attached storage to a server as if it were
locally attached disks. Unlike Fibre Channel, it requires no special
fibre cabling: you can use iSCSI to make storage located anywhere in
the LAN part of the SAN over an existing Ethernet or VPN
infrastructure. In essence, iSCSI allows a server and a RAID array to
communicate using SCSI commands over the IP network. Because iSCSI
requires no additional cabling, it is the low-cost alternative to Fibre
Channel.
iSCSI Initiators and Targets
iSCSI
uses both initiators and targets. The initiator acts as the traditional
SCSI bus adapter, sending SCSI commands. There are two broad types of
initiator: software initiators and hardware initiators.
The
software initiator implements iSCSI using an existing network stack
and a network interface card (NIC) to emulate a SCSI device. A
software initiator is included in the Windows Server 2008 operating system. Figure 2 shows the iSCSI Initiator Properties page.
The
hardware initiator uses dedicated hardware to implement iSCSI. Run by
firmware on the hardware, it offloads iSCSI and Transmission Control
Protocol (TCP) processing from the host. In a hardware initiator, the
HBA combines a NIC and a SCSI bus adapter. If a client requests data
from a RAID array, the operating system does not have to generate the
SCSI commands and data requests; the hardware initiator does.
The
iSCSI target represents hard disk storage and is available in the
Windows Server 2008 operating system. A storage array is an example of
an iSCSI target. A Logical Unit Number (LUN) represents an individual
SCSI device. The initiator talks to the target to connect to a LUN,
which emulates a SCSI hard disk; the iSCSI system can then format and
manage a file system on the iSCSI LUN.
When iSCSI is used in a SAN it is referred to by special iSCSI names. iSCSI provides the following three name structures:
iSCSI Qualified Name (IQN)
Extended Unique Identifier (EUI)
T11 Network Address Authority (NAA)
An iSCSI participant is usually defined by three or four fields:
Hostname or IP address (e.g., iscsi.syngress.com)
Port number (e.g., 3260)
iSCSI name (e.g., the IQN iqn.2008-02.com.syngress:01.acd4ab21.fin256)
Optional CHAP secret (e.g., chapsecret)
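The IQN fields above follow a fixed convention: the literal prefix `iqn`, the year and month the naming authority registered its domain, the reversed domain name, and a unique string after the colon. A minimal sketch of splitting those fields (the function name `parse_iqn` is hypothetical; the example IQN is the one from the list above):

```python
# Illustrative sketch: an iSCSI Qualified Name has the form
#   iqn.<yyyy-mm>.<reversed-domain>:<unique-string>
# This parser just splits those conventional fields apart.

def parse_iqn(iqn: str):
    prefix, _, unique = iqn.partition(":")
    tag, date, authority = prefix.split(".", 2)
    assert tag == "iqn", "not an iSCSI Qualified Name"
    return {"date": date, "authority": authority, "unique": unique}

print(parse_iqn("iqn.2008-02.com.syngress:01.acd4ab21.fin256"))
# {'date': '2008-02', 'authority': 'com.syngress', 'unique': '01.acd4ab21.fin256'}
```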
Now
that the iSCSI initiators and targets have names, they have to prove
their identity; they do this using the Challenge-Handshake
Authentication Protocol (CHAP), which prevents identities from being
exchanged in cleartext. In addition to using CHAP to secure the
identity handshake, you can also use IPSec, since iSCSI is an IP-based
protocol. To ensure that traffic flowing between initiators and targets
is as secure as possible, the SAN is run in a logically isolated
network segment.
Additionally,
as with all IP-based protocols, IPSec can be used at the network layer.
The iSCSI negotiation protocol is designed to accommodate other
authentication schemes, though interoperability issues limit their
deployment. This eliminates most of the security concerns for important
data traveling on the IP LAN. The other security concern is
unauthorized servers initiating sessions to the storage array. Regular
audits and checks have to be put in place to ensure that the initiators
authenticated to an array are legitimately connected to a LUN.
Targets
can be much more than a RAID array. If a physical device with a SCSI
parallel interface or a Fibre Channel interface is bridged by iSCSI
target software, it too can become part of the SAN. Virtual Tape
Libraries (VTLs) are used in disk storage scenarios to store data to
virtual tapes. In security surveillance, IP-enabled cameras can act as
initiators, targeting an iSCSI RAID array to store hours of quality
video for later processing.
Mount Points
One
of the benefits of using NTFS is the ability to use volume mount
points. A volume mount point is essentially a directory on an existing
volume (hard disk) through which another volume is accessed. For
example, a folder on the C: drive called “removable” can be made the
mount point of a new hard drive you have added to the computer; the
“removable” folder then becomes the gateway for storing data on the
newly added volume. The volume to be mounted can be formatted in a
number of file systems, including NTFS, FAT16, FAT32, CDFS, or UDF.
To
better understand volume mount points, consider this scenario. A user
has installed the computer operating system on a relatively small C:
drive and is concerned about unnecessarily using up storage space on
the C: drive which will be needed by the Windows operating system
itself. The user works with large motion graphics files. Knowing that
these files can consume a lot of storage space, the user creates a
folder on the C: drive called “motion” and uses it as a volume mount
point for a new volume. The user then configures the motion graphics
application to store the motion graphics files under C:\motion. This
means that the files are not using up valuable storage space on the C:
drive, but are actually stored on the mounted volume.
To create a volume mount point, follow these steps:
1. Create an empty folder on the NTFS-formatted C: drive, called “mount point” (this folder name can be whatever you want; it doesn’t have to be “mount point”).
2. Open Computer Management and select Disk Management.
3. Right-click the new volume (e.g., the newly added 40 GB partition or physical drive) and select Change Drive Letter and Paths, as shown in Figure 3.
4. Click Add, and select Mount in the following empty NTFS folder.
5. Browse to the empty NTFS folder on the C: drive and select OK.
6. Figure 4 shows what the result will look like. The “mount point” folder on the C: drive with a drive icon is a mount point to the physical drive or partition that was selected in Disk Management. The result is that you now have an extra 40 GB of storage mounted to the C: drive that you can use.
7. To remove the mount point from the selected folder, follow the same steps and choose Remove from the menu in step 4. Removing the mount point does not remove the folder originally created, nor does it remove the files stored on the physical disk. You can mount the drive again, or you can assign another drive letter to the drive to access the files on it.
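Once the mount is in place, the folder behaves as the root of the mounted volume. A quick way to confirm this programmatically is Python's `os.path.ismount`, which on Windows asks the OS whether a path is a volume mount point. This is a hedged sketch: on a Windows machine configured as in the steps above it would return True for the “mount point” folder; the demonstration below uses the filesystem root, which is a mount point on any OS.

```python
# Illustrative sketch: os.path.ismount reports whether a path is a
# volume mount point. On Windows, it would return True for the
# "mount point" folder created in the steps above; the filesystem
# root is used here because it is always a mount point.
import os

def is_mount_point(path: str) -> bool:
    return os.path.ismount(path)

print(is_mount_point(os.path.abspath(os.sep)))  # True
```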